Biological Pattern Discovery with R Machine Learning Approaches (Zheng Rong Yang)

Fig. 3.23. An illustration of the introduction of the momentum term.

In order to improve the learning quality of an MLP model, the above update rule has been altered by introducing the momentum term [Rumelhart, et al., 1986], which is based on the update of the model parameters in the previous learning cycle. The momentum term can help prevent over-update of model parameters. The use of the momentum term is shown below, where $0 < \alpha < 1$ is called the momentum factor,

$$\Delta\mathbf{w}^{t+1} = -\eta\,\frac{\nabla\varepsilon^{t}}{\nabla\mathbf{w}^{t}} + \alpha\,\Delta\mathbf{w}^{t} \tag{3.36}$$

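In the above equation, $\Delta\mathbf{w}^{t}$ stands for the update of $\mathbf{w}$ at the learning cycle $t$ and $\Delta\mathbf{w}^{t+1}$ stands for the update of $\mathbf{w}$ at the learning cycle $t + 1$. It can be seen that the update term $\eta\,\nabla\varepsilon^{t}/\nabla\mathbf{w}^{t}$ and the momentum term $\alpha\,\Delta\mathbf{w}^{t}$ can have different signs, hence different directions. Whenever the term $\eta\,\nabla\varepsilon^{t}/\nabla\mathbf{w}^{t}$ goes too far, the term $\alpha\,\Delta\mathbf{w}^{t}$ will pull the move backward.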

Therefore the momentum term can reduce the oscillation [...] and prevent the potential move from a wrong direction so as to [...] saddle point on the error function curve. As shown in Figure 3.23, [...] move from $x_0$ to $x_3$, which will make the move to the saddle point [...]
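The momentum update of Eq. (3.36) can be tried out numerically. The following is only a minimal sketch under stated assumptions, not the book's implementation: the error function $\varepsilon(w) = 0.5\,w^2$, the starting point, the learning rate `eta = 1.9`, and the momentum factor `alpha = 0.5` are all hypothetical choices made so that the overshoot is easy to see, and the helper names `momentum_update` and `descend` are introduced here for illustration.

```python
def momentum_update(grad, prev_update, eta, alpha):
    """Eq. (3.36): new update = -eta * gradient + alpha * previous update."""
    return -eta * grad + alpha * prev_update


def descend(grad_fn, w0, eta, alpha, steps):
    """Iterate w <- w + delta_w and return every visited value of w."""
    w, delta_w = w0, 0.0  # assumption: no previous update at the first cycle
    path = [w]
    for _ in range(steps):
        delta_w = momentum_update(grad_fn(w), delta_w, eta, alpha)
        w = w + delta_w
        path.append(w)
    return path


def grad_fn(w):
    """Gradient of the hypothetical error function eps(w) = 0.5 * w**2."""
    return w


# alpha = 0 switches the momentum term off, giving the plain update rule.
plain = descend(grad_fn, w0=1.0, eta=1.9, alpha=0.0, steps=20)
with_mom = descend(grad_fn, w0=1.0, eta=1.9, alpha=0.5, steps=20)

for t in (0, 1, 2, 5, 10, 20):
    print(f"t={t:2d}  plain={plain[t]: .5f}  momentum={with_mom[t]: .5f}")
```

In this particular setting the plain rule overshoots the minimum at every cycle, so $w$ flips sign each step while its magnitude shrinks only by a factor of 0.9. The run with the momentum term approaches zero noticeably faster. Other choices of `eta` and `alpha` can behave differently, so this should be read as one illustrative case rather than a general guarantee.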

Another advanced approach for improving the learning capability is the use of the second order derivative, such as the Hessian matrix [Bishop,